Xiangyu Tony Zhang

StepAudio 2.5 Technical Report

May 22, 2026

DuplexSLA: A Full-Duplex Spoken Language Model with Synchronized Speech, Language, and Action

May 20, 2026

Boosting Omni-Modal Language Models: Staged Post-Training with Visually Debiased Evaluation

May 13, 2026

Step-Audio-R1.5 Technical Report

Apr 28, 2026

Step-Audio 2 Technical Report

Jul 24, 2025